Learning to design RNA polymers with graph kernels
Authors
Abstract
RNA polymers perform many functions in cells and are relatively easy to create. Methods for constructing RNA sequences that realize a desired function are therefore valuable. Conventionally, this problem is tackled with inverse folding, which depends on very strict, user-provided constraints. In this work I present an approach that learns those constraints before generating new RNA molecules. [1] describes a general-purpose approach to sampling from distributions of graphs: Metropolis-Hastings (MH) sampling is combined with a graph grammar that proposes graphs and a graph kernel that aids in evaluating the proposals. Because the described grammar is locally context-sensitive, it cannot represent long-range constraints. When only local constraints are considered in proposing a graph, the ignored long-range dependencies increase the rejection rate of the MH algorithm. This is especially true for RNA secondary structure graphs, where the structures themselves impose constraints that cannot be represented in a locally context-sensitive grammar. In this work I present a method that uses graph minors (a minor is obtained by contracting adjacent nodes in a graph) to enhance the grammar's expressiveness.
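The contraction step mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the adjacency-set representation, the function name `contract`, and the four-node toy graph are all assumptions made for illustration only.

```python
from typing import Dict, Hashable, Set

# Undirected graph stored as adjacency sets (an assumed representation).
Graph = Dict[Hashable, Set[Hashable]]


def contract(graph: Graph, u: Hashable, v: Hashable) -> Graph:
    """Return a new graph in which the adjacent nodes u and v are merged into u."""
    if v not in graph[u]:
        raise ValueError("only adjacent nodes can be contracted")
    merged: Graph = {}
    for node, neighbours in graph.items():
        if node == v:
            continue
        # Redirect every edge that pointed at v so that it points at u.
        merged[node] = {u if n == v else n for n in neighbours}
    # u inherits v's neighbours; drop the self-loop created by the merge.
    merged[u] = (merged[u] | graph[v]) - {u, v}
    return merged


# Hypothetical toy graph: a backbone path 0-1-2-3 closed by one pairing edge 0-3.
toy: Graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

# Contracting the paired nodes 0 and 3 yields a triangle on nodes 0, 1, 2.
minor = contract(toy, 0, 3)
```

Repeated contraction of this kind collapses a structural element into a single node, so that elements far apart in the sequence can become neighbours in the contracted graph.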
Similar resources
Two new graphs kernels in chemoinformatics
Chemoinformatics is a well established research field concerned with the discovery of molecule’s properties through informational techniques. Computer science’s research fields mainly concerned by chemoinformatics are machine learning and graph theory. From this point of view, graph kernels provide a nice framework combining machine learning graph theory techniques. Such kernels prove their eff...
kProlog: an algebraic Prolog for kernel programming
kProlog is a simple algebraic extension of Prolog with facts and rules annotated with semiring labels. We propose kProlog as a language for learning with kernels. kProlog allows to elegantly specify systems of algebraic expressions on databases. We propose some code examples of gradually increasing complexity, we give a declarative specification of some matrix operations and an algorithm to sol...
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
Two New Graph Kernels and Applications to Chemoinformatics
Chemoinformatics is a well established research field concerned with the discovery of molecule’s properties through informational techniques. Computer science’s research fields mainly concerned by the chemoinformatics field are machine learning and graph theory. From this point of view, graph kernels provide a nice framework combining machine learning techniques with graph theory. Such kernels ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2015